Skip to content

fix(rest-api): Check IP Usage for requested Prefix before creating or updating Instance#2347

Open
hwadekar-nv wants to merge 5 commits into
NVIDIA:mainfrom
hwadekar-nv:feat/check-ip-usage
Open

fix(rest-api): Check IP Usage for requested Prefix before creating or updating Instance#2347
hwadekar-nv wants to merge 5 commits into
NVIDIA:mainfrom
hwadekar-nv:feat/check-ip-usage

Conversation

@hwadekar-nv

@hwadekar-nv hwadekar-nv commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Description

  • Validate if IP exhausted for Subnet or VpcPrefix before creating or updating Instance

Type of Change

  • Fix - Bug fixes

Testing

  • Unit tests added/updated

Additional Notes

@hwadekar-nv hwadekar-nv self-assigned this Jun 9, 2026
@hwadekar-nv hwadekar-nv requested a review from a team as a code owner June 9, 2026 23:01
@coderabbitai

coderabbitai Bot commented Jun 9, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 3da6824c-3078-4a94-9e55-bf2b06cffe8d

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review

Walkthrough

Adds batched prefix-usage DAOs for Subnet and VpcPrefix, updates Instance create/update handlers to batch usage lookups and reject requests when requested (or net-new) interfaces would exhaust AvailableIPs, updates Get handlers to use batched usage, and adds tests/fixtures for exhausted Subnet and VPC-prefix scenarios.

Changes

IP-exhaustion validation and DAO batching

Layer / File(s) Summary
Batched Subnet/VpcPrefix GetPrefixUsage DAO
rest-api/db/pkg/db/model/subnet.go, rest-api/db/pkg/db/model/vpcprefix.go
GetPrefixUsage refactored to accept variadic ...*Subnet/...*VpcPrefix and return map[uuid.UUID]*cipam.Usage; adds CIDR helpers, batched interface-IP queries, and per-ID usage computation.
Instance create/update: batched usage lookup and exhaustion checks
rest-api/api/pkg/api/handler/instance.go
Handlers build per-request subnet/VPC-prefix reference counts, batch-fetch prefix usage via DAOs, compute incoming-interface deltas (creation vs. update), and return HTTP 400 when Subnet: AcquiredIPs + incoming > AvailableIPs or VPC Prefix: AcquiredIPs + incoming*2 > AvailableIPs. DAO failures produce HTTP 500.
Get handlers: batch usage population for Subnet and VpcPrefix
rest-api/api/pkg/api/handler/subnet.go, rest-api/api/pkg/api/handler/vpcprefix.go
GetAll and single-Get handlers collect eligible prefixes, call batched GetPrefixUsage once, populate usage maps from returned per-ID map, and error when expected entries are missing for single-get.
Tests: deterministic subnet prefixes and exhaustion fixtures
rest-api/api/pkg/api/handler/instance_test.go
testInstanceBuildSubnet now sets deterministic IPv4Prefix and PrefixLength=24. Adds exhaust fixture instance-type/allocation-constraint, exhausted /28 Subnet and /28 VPC Prefix fixtures populated to exhaust IPs, and two subtests asserting HTTP 400 with explicit "IP addresses ... are exhausted" messages.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 40.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title precisely describes the main change: adding IP usage validation for requested prefixes before creating or updating instances, which is the primary focus of the changeset.
Description check ✅ Passed The description is directly related to the changeset, clearly stating the intent to validate IP exhaustion for Subnets or VpcPrefixes before instance operations.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@hwadekar-nv hwadekar-nv force-pushed the feat/check-ip-usage branch from 55d3f34 to bc8f89c Compare June 9, 2026 23:03
@github-actions

github-actions Bot commented Jun 9, 2026

Copy link
Copy Markdown

🔐 TruffleHog Secret Scan

No secrets or credentials found!

Your code has been scanned for 700+ types of secrets and credentials. All clear! 🎉

🔗 View scan details

🕐 Last updated: 2026-06-09 23:05:08 UTC | Commit: 55d3f34

@github-actions

github-actions Bot commented Jun 9, 2026

Copy link
Copy Markdown

🔍 Container Scan Summary

Service Total Critical High Medium Low Other
nico-flow 126 14 54 44 5 9
nico-nsm 133 11 45 66 11 0
nico-psm 128 14 56 44 5 9
nico-rest-api 192 17 88 70 8 9
nico-rest-cert-manager 105 6 51 35 4 9
nico-rest-db 126 14 54 44 5 9
nico-rest-site-agent 125 14 54 44 4 9
nico-rest-site-manager 112 7 52 40 4 9
nico-rest-workflow 128 14 56 44 5 9
TOTAL 1175 111 510 431 51 72

Per-CVE detail lives in the per-service grype-* artifacts (JSON + SARIF). Severity counts only — no CVE IDs published here.

@hwadekar-nv hwadekar-nv force-pushed the feat/check-ip-usage branch from bc8f89c to d095d7e Compare June 9, 2026 23:19

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
rest-api/api/pkg/api/handler/instance_test.go (2)

3464-3521: ⚡ Quick win

Add mirrored IP-exhaustion tests for UpdateInstanceHandler.

This change set adds create-path exhaustion assertions, but the PR scope also updates update-path exhaustion validation. Add matching update subtests (Subnet exhausted, VPC Prefix exhausted) to keep regression coverage symmetrical.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rest-api/api/pkg/api/handler/instance_test.go` around lines 3464 - 3521, The
test suite added create-path IP-exhaustion cases but missed equivalent
update-path tests; add two subtests in instance_test.go that mirror the create
cases but exercise UpdateInstanceHandler (use APIInstanceUpdateRequest payloads
or the existing update test harness, reference UpdateInstanceHandler / test
names like "test Instance update API endpoint failed when subnet IP addresses
are exhausted" and "test Instance update API endpoint failed when VPC Prefix IP
addresses are exhausted"), using the same fixtures (subnetExhausted,
vpcPrefixExhausted, tn1, vpc9, tnu1, tnOrg) and asserting http.StatusBadRequest
with respMessage matching the formatted messages used in the create tests so
regression coverage is symmetric.

910-913: ⚡ Quick win

Use range-over-integer loops for counting fixture rows.

Lines 910 and 1183 use C-style counting loops where range can express intent directly and matches repository convention.

Suggested fix
-	for i := 0; i < 14; i++ {
+	for i := range 14 {
 		exhaustInst := testInstanceBuildInstance(t, dbSession, fmt.Sprintf("exhaust-subnet-inst-%d", i), tn1.ID, ip.ID, st1.ID, &istExhaustFixture.ID, vpc1.ID, nil, &os1.ID, nil, cdbm.InstanceStatusReady)
 		testInstanceBuildInstanceInterface(t, dbSession, exhaustInst.ID, &subnetExhausted.ID, nil, nil, cdbm.InterfaceStatusPending)
 	}
...
-	for i := 0; i < 8; i++ {
+	for i := range 8 {
 		exhaustInst := testInstanceBuildInstance(t, dbSession, fmt.Sprintf("exhaust-vpcprefix-inst-%d", i), tn1.ID, ip.ID, st1.ID, &istExhaustFixture.ID, vpc9.ID, nil, &os1.ID, nil, cdbm.InstanceStatusReady)
 		testInstanceBuildInstanceInterface(t, dbSession, exhaustInst.ID, nil, &vpcPrefixExhausted.ID, nil, cdbm.InterfaceStatusPending)
 	}

As per coding guidelines, counting loops in rest-api/**/*.go should prefer for i := range n over C-style index loops.

Also applies to: 1183-1186

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rest-api/api/pkg/api/handler/instance_test.go` around lines 910 - 913,
Replace the C-style counting loops with range-over-integer loops: change the for
i := 0; i < 14; i++ loops (the ones that call testInstanceBuildInstance and
testInstanceBuildInstanceInterface using variable i and subnetExhausted) to use
a range over a fixed-length container (e.g., for i := range make([]struct{},
14)) so it matches repository convention; apply the same change to the second
occurrence around the block that begins at the other referenced lines (the loop
that also builds fixtures and calls testInstanceBuildInstanceInterface).

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@rest-api/api/pkg/api/handler/instance_test.go`:
- Around line 777-781: The call to testInstanceSiteBuildAllocationContraints
that creates the exhaustion fixture constraint is currently ignoring its return
value; capture the returned value (e.g., constraint or result) from
testInstanceSiteBuildAllocationContraints(al1,
cdbm.AllocationResourceTypeInstanceType, istExhaustFixture.ID,
cdbm.AllocationConstraintTypeReserved, 30, ipu) and add an explicit assertion
(e.g., assert.NotNil / assert.NoError) immediately after to fail fast if fixture
creation fails, referencing istExhaustFixture and the allocation parameters so
the test cannot proceed on a bad setup.

In `@rest-api/api/pkg/api/handler/instance.go`:
- Around line 435-447: The code calls sbDAO.GetPrefixUsage on subnets (loop over
subnets / variable subnets[i], storing subnetUsage) before performing
tenant/site/VPC ownership and scope checks, which leaks allocation state across
tenants and does unnecessary DAO work; update the logic so GetPrefixUsage is
invoked only after the per-interface ownership/readiness/scope validation for
that Subnet/VPC Prefix has passed (i.e., after the existing ownership checks in
the per-interface validation path), move the subnetUsage/exhaustion checks from
the initial subnets loop into that per-interface validation, and add a simple
in-memory cache keyed by Subnet ID (or VPC prefix ID) to avoid duplicate
sbDAO.GetPrefixUsage calls for the same ID during one request (reference
sbDAO.GetPrefixUsage, subnetUsage, subnets[i], and the per-interface validation
code paths).

---

Nitpick comments:
In `@rest-api/api/pkg/api/handler/instance_test.go`:
- Around line 3464-3521: The test suite added create-path IP-exhaustion cases
but missed equivalent update-path tests; add two subtests in instance_test.go
that mirror the create cases but exercise UpdateInstanceHandler (use
APIInstanceUpdateRequest payloads or the existing update test harness, reference
UpdateInstanceHandler / test names like "test Instance update API endpoint
failed when subnet IP addresses are exhausted" and "test Instance update API
endpoint failed when VPC Prefix IP addresses are exhausted"), using the same
fixtures (subnetExhausted, vpcPrefixExhausted, tn1, vpc9, tnu1, tnOrg) and
asserting http.StatusBadRequest with respMessage matching the formatted messages
used in the create tests so regression coverage is symmetric.
- Around line 910-913: Replace the C-style counting loops with
range-over-integer loops: change the for i := 0; i < 14; i++ loops (the ones
that call testInstanceBuildInstance and testInstanceBuildInstanceInterface using
variable i and subnetExhausted) to use a range over a fixed-length container
(e.g., for i := range make([]struct{}, 14)) so it matches repository convention;
apply the same change to the second occurrence around the block that begins at
the other referenced lines (the loop that also builds fixtures and calls
testInstanceBuildInstanceInterface).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c21bacd8-6fb3-4938-8fae-d0e332fa9d4d

📥 Commits

Reviewing files that changed from the base of the PR and between f7b232c and bc8f89c.

📒 Files selected for processing (2)
  • rest-api/api/pkg/api/handler/instance.go
  • rest-api/api/pkg/api/handler/instance_test.go

Comment thread rest-api/api/pkg/api/handler/instance_test.go
Comment on lines +435 to +447
for i := range subnets {
subnetIDMap[subnets[i].ID] = &subnets[i]

// Check if subnet prefix is exhausted
subnetUsage, err := sbDAO.GetPrefixUsage(ctx, nil, &subnets[i])
if err != nil {
logger.Error().Err(err).Msg("error getting prefix usage for Subnet")
return cutil.NewAPIErrorResponse(c, http.StatusInternalServerError, "Failed to get prefix usage for Subnet", nil)
}
if subnetUsage != nil && subnetUsage.AvailableIPs > 0 && subnetUsage.AcquiredIPs >= subnetUsage.AvailableIPs {
logger.Warn().Msg(fmt.Sprintf("Ip Addresses for Subnet ID: %v specified in interfaces data in request are exhausted", subnets[i].ID))
return cutil.NewAPIErrorResponse(c, http.StatusBadRequest, fmt.Sprintf("Ip Addresses for Subnet ID: %v specified in interfaces data in request are exhausted", subnets[i].ID), nil)
}

@coderabbitai coderabbitai Bot Jun 9, 2026

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Run exhaustion checks after ownership/scope validation to avoid cross-tenant state disclosure.

At Line 439, Line 463, Line 2317, and Line 2340, GetPrefixUsage is called before tenant/site/VPC ownership checks. That allows different responses for foreign IDs (e.g., “exhausted” vs. “not owned”), leaking allocation state across tenants and doing unnecessary DAO work on resources that should be rejected first.

Move exhaustion checks into the per-interface validation path, after ownership/readiness/scope checks pass for that specific Subnet/VPC Prefix (optionally cache usage per ID to avoid duplicate DAO calls).

Also applies to: 459-471, 2316-2326, 2339-2349

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@rest-api/api/pkg/api/handler/instance.go` around lines 435 - 447, The code
calls sbDAO.GetPrefixUsage on subnets (loop over subnets / variable subnets[i],
storing subnetUsage) before performing tenant/site/VPC ownership and scope
checks, which leaks allocation state across tenants and does unnecessary DAO
work; update the logic so GetPrefixUsage is invoked only after the per-interface
ownership/readiness/scope validation for that Subnet/VPC Prefix has passed
(i.e., after the existing ownership checks in the per-interface validation
path), move the subnetUsage/exhaustion checks from the initial subnets loop into
that per-interface validation, and add a simple in-memory cache keyed by Subnet
ID (or VPC prefix ID) to avoid duplicate sbDAO.GetPrefixUsage calls for the same
ID during one request (reference sbDAO.GetPrefixUsage, subnetUsage, subnets[i],
and the per-interface validation code paths).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hwadekar-nv, confirmed. The exhaustion checks in both CreateInstanceHandler and UpdateInstanceHandler are now correctly positioned after the ownership, readiness, and VPC-scope guards for each Subnet and VPC Prefix — eliminating the cross-tenant state disclosure risk.

@hwadekar-nv hwadekar-nv force-pushed the feat/check-ip-usage branch 5 times, most recently from daca9d7 to 5658735 Compare June 10, 2026 20:43
@hwadekar-nv

Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@github-actions

Copy link
Copy Markdown

@hwadekar-nv

Copy link
Copy Markdown
Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai

coderabbitai Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

vpUsage := vpcPrefixUsageMap[vpcPrefixID]
if vpUsage != nil && vpUsage.AvailableIPs > 0 && vpUsage.AcquiredIPs+uint64(incomingInterfaceIPs)*2 > vpUsage.AvailableIPs {
logger.Warn().Msg(fmt.Sprintf("Ip Addresses for VPC Prefix ID: %v specified in interfaces data in request are exhausted", vpcPrefixID))
return cutil.NewAPIErrorResponse(c, http.StatusBadRequest, fmt.Sprintf("Ip Addresses for VPC Prefix ID: %v specified in interfaces data in request are exhausted", vpcPrefixID), nil)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider more specific info in the error message:

  • the number of IP addresses remaining in the VPC prefix
  • the number of interfaces specified in the request (N)
  • the number of IP addresses required to create those specified interfaces (2N)

That would help the user better define their interfaces in a way that makes use of the remaining IP space in the prefix if they wish to do so.

Something similar might be helpful for subnets, but those of course are more straightforward in that one interface uses one IP address.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @nvlitagaki , since we are raising error for IP Address exhaust, the number of IP addresses required to create those specified interfaces (2N) could be helpful or just return the current usage along with it? cc @thossain-nv

@hwadekar-nv hwadekar-nv force-pushed the feat/check-ip-usage branch from 5658735 to 85369bf Compare June 10, 2026 21:26
Comment thread rest-api/db/pkg/db/model/subnet.go Outdated
}

idb := db.GetIDB(tx, ssd.dbSession)
ifcCounts, ifcIPs, err := queryEthernetInterfaceIPsForSubnets(ctx, idb, subnetIDs)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

queryEthernetInterfaceIPsForSubnets is only being called once, let's inline this.

Comment thread rest-api/db/pkg/db/model/subnet.go Outdated
// queryEthernetInterfaceIPsForSubnet returns iface row count and, for interfaces with IPs,
// each interface's assigned addresses. COUNT(*) equals len(rows) for the same join/filter on one SELECT.
func queryEthernetInterfaceIPsForSubnet(ctx context.Context, idb bun.IDB, subnetID uuid.UUID) (ifaceRows int64, ipStrings [][]string, err error) {
func subnetIPv4CIDR(sn *Subnet) (string, bool) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make this a GetIPv4CIDR receiver function of Subnet. It should return *string, nil if we can't formulate the CIDR. Let's avoid using free hanging functions which is a telltale sign of AI generated code.

Comment thread rest-api/db/pkg/db/model/subnet.go Outdated
return nil, err
}

func subnetPrefixUsageFromInterfaces(ctx context.Context, cidr string, ifcCount int64, ips [][]string) (*cipam.Usage, error) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need ifcCount, it should be the same is len(ips). Also why is ips array of array? Let's flatten them.

Comment thread rest-api/db/pkg/db/model/vpcprefix.go Outdated
// and, for each row with assigned IPs, a slice of that interface's IPv4 addresses.
// One SELECT suffices: COUNT(*) equals the number of result rows given the same join/filter.
func queryEthernetInterfaceIPsForVPCPrefix(ctx context.Context, idb bun.IDB, vpcPrefixID uuid.UUID) (ifaceRows int64, ipStrings [][]string, err error) {
func vpcPrefixCIDR(vp *VpcPrefix) (string, bool) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comments as above.

@hwadekar-nv hwadekar-nv force-pushed the feat/check-ip-usage branch from 689b522 to 1948819 Compare June 11, 2026 00:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants